The Dafny Integrated Development Environment
In recent years, program verifiers and interactive theorem provers have
become more powerful and more suitable for verifying large programs or proofs.
This has demonstrated the need for improving the user experience of these tools
to increase productivity and to make them more accessible to non-experts. This
paper presents an integrated development environment for Dafny, a programming
language, verifier, and proof assistant, that addresses issues present in most
state-of-the-art verifiers: low responsiveness and lack of support for
understanding non-obvious verification failures. The paper demonstrates several
new features that move the state of the art closer to a verification
environment that can provide verification feedback as the user types and can
present more helpful information about the program or failed verifications in a
demand-driven and unobtrusive way.
Comment: In Proceedings F-IDE 2014, arXiv:1404.578
Automatically Testing Functional Properties of Code Translation Models
Large language models are becoming increasingly practical for translating
code across programming languages, a process known as transpilation. Even
though automated transpilation significantly boosts developer productivity, a
key concern is whether the generated code is correct. Existing work initially
used manually crafted test suites to test the translations of a small corpus of
programs; these test suites were later automated. In contrast, we devise the
first approach for automated, functional, property-based testing of code
translation models. Our general, user-provided specifications about the
transpiled code capture a range of properties, from purely syntactic to purely
semantic ones. As shown by our experiments, this approach is very effective in
detecting property violations in popular code translation models, and
therefore, in evaluating model quality with respect to given properties. We
also go a step further and explore the usage scenario where a user simply aims
to obtain a correct translation of some code with respect to certain properties
without necessarily being concerned about the overall quality of the model. To
this purpose, we develop the first property-guided search procedure for code
translation models, where a model is repeatedly queried with slightly different
parameters to produce alternative and potentially more correct translations.
Our results show that this search procedure helps to obtain significantly
better code translations.
Comment: 13 pages including appendix and references
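The property-guided search described in this abstract can be illustrated with a short Python sketch: the model is queried repeatedly with slightly different sampling parameters, and each candidate translation is checked against user-provided properties. The `query_model` function, the canned outputs, and the example semantic property are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of property-guided search over a code translation model.
# `query_model` is a hypothetical stand-in: here it picks from canned
# outputs, but in practice it would sample an LLM at the given
# temperature.

CANNED = {
    0.0: "def add(a, b): return a - b",   # buggy translation
    0.5: "def add(a, b): return a + b",   # correct translation
}

def query_model(source, temperature):
    return CANNED.get(temperature, CANNED[0.0])

def property_guided_search(source, properties, temperatures=(0.0, 0.5, 0.8)):
    """Query the model with slightly different parameters until a
    translation satisfies every property; otherwise return the
    candidate that violates the fewest properties."""
    best, best_score = None, -1
    for temp in temperatures:
        candidate = query_model(source, temp)
        passed = sum(1 for prop in properties if prop(source, candidate))
        if passed == len(properties):
            return candidate  # all properties hold
        if passed > best_score:
            best, best_score = candidate, passed
    return best

# A purely semantic property: the translated `add` must behave like
# addition on sample inputs. (Executing untrusted model output via
# exec is acceptable only in a sketch like this, not in production.)
def behaves_like_addition(source, candidate):
    env = {}
    exec(candidate, env)
    return all(env["add"](a, b) == a + b for a, b in [(1, 2), (3, 4)])

result = property_guided_search("fn add(a: i64, b: i64) -> i64 { a + b }",
                                [behaves_like_addition])
```

Here the search rejects the buggy candidate produced at temperature 0.0 and returns the correct translation found at temperature 0.5.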
Perfectly Parallel Fairness Certification of Neural Networks
Recently, there has been growing concern that machine-learning models, which currently assist or even automate decision making, reproduce, and in the worst case reinforce, bias present in the training data. The development of tools and techniques for certifying the fairness of these models or describing their biased behavior is, therefore, critical. In this paper, we propose a perfectly parallel static analysis for certifying causal fairness of feed-forward neural networks used for classification tasks. When certification succeeds, our approach provides definite guarantees; otherwise, it describes and quantifies the biased behavior. We design the analysis to be sound, in practice also exact, and configurable in terms of scalability and precision, thereby enabling pay-as-you-go certification. We implement our approach in an open-source tool and demonstrate its effectiveness on models trained with popular datasets.
Perfectly Parallel Fairness Certification of Neural Networks
Recently, there has been growing concern that machine-learned software, which currently assists or even automates decision making, reproduces, and in the worst case reinforces, bias present in the training data. The development of tools and techniques for certifying the fairness of this software or describing its biases is, therefore, critical. In this paper, we propose a perfectly parallel static analysis for certifying the fairness of feed-forward neural networks used for the classification of tabular data. When certification succeeds, our approach provides definite guarantees; otherwise, it describes and quantifies the biased input-space regions. We design the analysis to be sound, in practice also exact, and configurable in terms of scalability and precision, thereby enabling pay-as-you-go certification. We implement our approach in an open-source tool called libra and demonstrate its effectiveness on neural networks trained on popular datasets.
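The causal-fairness question these two abstracts address can be made concrete with a small Python sketch: a classifier is causally fair with respect to a sensitive input if changing only that input never flips the classification. The toy one-layer "network" and the brute-force input grid below are illustrative assumptions; the paper's contribution is a sound, perfectly parallel static analysis over the network, not enumeration of inputs.

```python
# Sketch of the causal-fairness property: flipping only the sensitive
# feature (here, the last one) must never change the predicted class.

def classify(x):
    # Toy one-layer "network": linear score with a ReLU, thresholded.
    # The sensitive feature gets zero weight, so this model is fair.
    w, b = [0.8, -0.5, 0.0], 0.1
    score = max(0.0, sum(wi * xi for wi, xi in zip(w, x)) + b)
    return 1 if score > 0.5 else 0

def certify_causal_fairness(classify, grid, sensitive_index=2):
    """Return (True, None) if no input in the grid changes class when
    only the sensitive feature is flipped, else (False, witness)."""
    for x in grid:
        flipped = list(x)
        flipped[sensitive_index] = 1 - flipped[sensitive_index]
        if classify(x) != classify(flipped):
            return False, (x, flipped)  # witness of a biased region
    return True, None

# Enumerate a coarse grid over two non-sensitive features and a
# binary sensitive feature.
grid = [(a / 4, b / 4, s) for a in range(5) for b in range(5) for s in (0, 1)]
fair, witness = certify_causal_fairness(classify, grid)
```

Because the sketch only samples a finite grid, it can refute fairness (by finding a witness) but cannot certify it in general; the papers' static analysis covers the whole input space, which is what makes its guarantees definite.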